NTCIR-3 CLIR Experiments at MSRA

Authors

  • Hongzhao He
  • Jianfeng Gao
Abstract

This paper describes three statistical models for resolving query translation ambiguity in cross-language information retrieval (CLIR). First, a decaying co-occurrence model is presented. It extends traditional co-occurrence models with a decaying factor that decreases the mutual information as the distance between the terms increases. Second, a phrase translation model is described that aims to detect and translate noun phrases that are not stored in the dictionary. Finally, a triple translation model is proposed, which provides a way of exploiting linguistic dependency information. We show experimentally that using these models yields improvements on the TREC and NTCIR corpora.
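
To make the idea of a decaying factor concrete, the following is a minimal sketch, not the authors' formulation. It assumes pointwise mutual information as the co-occurrence measure and an exponential decay in the term distance; the function name `decayed_mi`, the exponential form, and the default `alpha` are all illustrative assumptions that the abstract does not specify.

```python
import math


def decayed_mi(p_xy, p_x, p_y, distance, alpha=0.8):
    """Mutual information of two terms, scaled down as they sit further apart.

    Hypothetical helper: the exponential decay and the alpha default are
    assumptions made for illustration, not values taken from the paper.
    """
    if min(p_xy, p_x, p_y) <= 0:
        raise ValueError("probabilities must be positive")
    if distance < 1:
        raise ValueError("distance must be at least 1 (adjacent terms)")
    # Pointwise mutual information of the term pair.
    mi = math.log(p_xy / (p_x * p_y))
    # Decay factor: 1.0 for adjacent terms, shrinking toward 0 with distance.
    decay = math.exp(-alpha * (distance - 1))
    return mi * decay


# The same co-occurrence statistics score lower when the terms are further apart.
near = decayed_mi(0.02, 0.1, 0.1, distance=1)
far = decayed_mi(0.02, 0.1, 0.1, distance=3)
print(near > far)
```

Under these assumptions, a candidate translation pair whose terms typically appear close together is preferred over a pair with the same raw co-occurrence counts but a larger typical distance.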

Similar resources

NTCIR-3 CLIR Experiments at Osaka Kyoiku University - Comparison of Gram-based Indices

Long gram-based indices were tested in the NTCIR-3 CLIR task. Building gram-based indices requires no analysis such as morphological analysis. Indices were built in three languages (Japanese, English, and Chinese) for this task. They differ considerably in some respects; for example, the difference in index overhead stems from the difference in character encoding.

Overview of CLIR Task at the Fifth NTCIR Workshop

The purpose of this paper is to overview research efforts at the NTCIR-5 CLIR task, which is a project of large-scale retrieval experiments on cross-lingual information retrieval (CLIR) of Chinese, Japanese, Korean, and English. The project has three sub-tasks, multi-lingual IR (MLIR), bilingual IR (BLIR), and single language IR (SLIR), in which many research groups from over ten countries are ...

Overview of CLIR Task at the Sixth NTCIR Workshop

The purpose of this paper is to overview research efforts at the NTCIR-6 CLIR task, which is a project of large-scale retrieval experiments on cross-lingual information retrieval (CLIR) of Chinese, Japanese, Korean, and English. The project has three sub-tasks, multi-lingual IR (MLIR), bilingual IR (BLIR), and single language IR (SLIR), in which many research groups from ten countries or region...

Overview of CLIR Task at the Third NTCIR Workshop

This report is an overview of Cross-Language Information Retrieval Task (CLIR) at the third NTCIR Workshop. There are 3 tracks in CLIR: Single Language IR (SLIR), Bilingual CLIR (BLIR), and Multilingual CLIR (MLIR). The scope, schedule, test collections, search results, relevance judgment, scoring results, and the preliminary analyses are described in the report.

Notes on the Limits of CLIR Effectiveness: NTCIR-2 Evaluation Experiments at Justsystem

NTCIR-2 evaluation experiments at the Justsystem site are described with a focus on comparative study of CLIR effectiveness with monolingual retrieval effectiveness of the same retrieval engine. Experiments on the effects of phrasal translation, indexing of translated phrasal terms, pre-translation feedback and parallel documents feedback in diverse retrieval settings, are reported. The results...

Publication date: 2002